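Deep learning has been widely used to infer robust grasps. Human-labeled RGB-D datasets were initially used to learn grasp configurations, but preparing such large datasets is expensive. To address this, images have been generated with a physics simulator, and a physically inspired model (for example, a contact model between a suction vacuum cup and an object) has been used as the grasp quality evaluation metric for annotating the synthesized images. However, such contact models are complicated and require experimental parameter identification to ensure real-world performance. In addition, previous studies have not considered robot reachability, for instance the case where a grasp configuration with high grasp quality cannot be reached because of collisions or the physical limitations of the robot. In this study, we propose an intuitive, geometric-analysis-based grasp quality evaluation metric. We further incorporate a reachability evaluation metric. Using the proposed metrics, we annotate synthesized images in a simulator with the combined evaluation criteria and train an encoder-decoder network called Suction Graspability U-Net++ (SG-U-Net++). Experimental results show that our intuitive grasp quality evaluation metric is competitive with a physically inspired metric. Learning reachability helps reduce motion planning computation time by eliminating obviously unreachable candidates. The system achieves an overall picking speed of 560 PPH (pieces per hour).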
translated by 谷歌翻译
As the societal impact of Deep Neural Networks (DNNs) grows, the goals for advancing DNNs become more complex and diverse, ranging from improving conventional model accuracy metrics to infusing advanced human virtues such as fairness, accountability, transparency (FaccT), and unbiasedness. Recently, techniques in Explainable Artificial Intelligence (XAI) have attracted considerable attention and have tremendously helped Machine Learning (ML) engineers understand AI models. At the same time, however, a need beyond XAI is emerging among AI communities: based on the insights learned from XAI, how can we better empower ML engineers to steer their DNNs so that the model's reasonableness and performance improve as intended? This article provides a timely and extensive literature overview of the field of Explanation-Guided Learning (EGL), a domain of techniques that steer the DNNs' reasoning process by adding regularization, supervision, or intervention on model explanations. We first provide a formal definition of EGL and its general learning paradigm. Second, we give an overview of the key factors for EGL evaluation, and we summarize and categorize existing evaluation procedures and metrics for EGL. Finally, we discuss current and potential future application areas and directions of EGL, and we present an extensive experimental study that compares existing EGL models in popular application domains such as Computer Vision (CV) and Natural Language Processing (NLP).
We propose a lightweight and highly efficient Joint Detection and Tracking pipeline for Multi-Object Tracking using a fully transformer-based architecture. It is a modified version of TransTrack that overcomes the computational bottleneck associated with the original design while achieving a state-of-the-art MOTA score of 73.20%. The model design is driven by a transformer-based backbone instead of a CNN, which scales well with the input resolution. We also propose a drop-in replacement for the Feed Forward Network of the transformer encoder layer: a Butterfly Transform Operation performs channel fusion, and a depth-wise convolution learns the spatial context within the feature maps that is otherwise missing from the attention maps of the transformer. With these modifications, we reduce the overall model size of TransTrack by 58.73% and its complexity by 78.72%. We therefore expect our design to provide novel perspectives for architecture optimization in future research on multi-object tracking.
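Graph diffusion problems, such as the propagation of rumors, computer viruses, or smart grid failures, are ubiquitous and societal. It is therefore often crucial to identify diffusion sources from the current graph diffusion observations. Despite its great practical necessity and significance, source localization, as the inverse problem of graph diffusion, is extremely challenging because it is ill-posed: different sources may lead to the same graph diffusion pattern. Unlike most traditional source localization methods, this paper takes a probabilistic approach that accounts for the uncertainty of different candidate sources. Such an effort must overcome several challenges: 1) the uncertainty in graph diffusion source localization is hard to quantify; 2) the complex patterns of graph diffusion sources are hard to characterize probabilistically; 3) generalization under any underlying diffusion pattern is hard to impose. To address these challenges, this paper presents a generic framework, the Source Localization Variational AutoEncoder (SL-VAE), for locating diffusion sources under arbitrary diffusion patterns. In particular, we propose a probabilistic model that leverages a forward diffusion estimation model together with deep generative models to approximate the diffusion source distribution and thereby quantify the uncertainty. SL-VAE further uses prior knowledge of source-observation pairs to characterize the complex patterns of diffusion sources through a learned generative prior. Finally, a unified objective that integrates the forward diffusion estimation model is derived to enforce generalization under arbitrary diffusion patterns. Extensive experiments on 7 real-world datasets demonstrate the superiority of SL-VAE in reconstructing diffusion sources, with an average improvement of 20% in AUC score.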
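Localizing the source of graph diffusion phenomena, such as misinformation propagation, is an important yet extremely challenging task. Existing source localization models typically depend heavily on hand-crafted rules. Unfortunately, a large portion of the graph diffusion process in many applications is still unknown to humans, so it is important to have expressive models that learn such underlying rules automatically. This paper aims to establish a generic framework of invertible graph diffusion models for source localization on graphs, namely Invertible Validity-aware Graph Diffusion (IVGD), to handle the major challenges, including 1) the difficulty of leveraging knowledge in graph diffusion models to model their inverse processes in an end-to-end fashion, 2) the difficulty of ensuring the validity of the inferred sources, and 3) the efficiency and scalability of source inference. Specifically, first, to inversely infer the sources of graph diffusion, we propose a graph residual scheme that makes existing graph diffusion models invertible with theoretical guarantees. Second, we develop a novel error compensation mechanism that learns to offset the errors of the inferred sources. Finally, to ensure the validity of the inferred sources, a new set of validity-aware layers has been devised that project the inferred sources onto feasible regions by flexibly encoding constraints with unrolled optimization techniques. A linearization technique is proposed to improve the efficiency of the proposed layers. The convergence of the proposed IVGD is proven theoretically. Extensive experiments on nine real-world datasets demonstrate that the proposed IVGD significantly outperforms state-of-the-art comparison methods. We have released our code at https://github.com/xianggebenben/ivgd.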
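As a rough illustration of how a residual-style diffusion map can be inverted, the sketch below uses the standard fixed-point iteration for residual maps: if the observation is y = x + f(x) and f is a contraction, the source x can be recovered by iterating x <- y - f(x). This is not the IVGD implementation; the function names, the 3-node path graph, and the 0.4 spreading factor are all assumptions made up for the example.

```python
def invert_residual(y, f, num_iters=100, tol=1e-10):
    """Solve y = x + f(x) for x with the iteration x <- y - f(x).

    Assumes f is a contraction (Lipschitz constant below 1), which is
    what makes the iteration converge to a unique solution.
    """
    x = list(y)  # start from the observation itself
    for _ in range(num_iters):
        fx = f(x)
        x_new = [yi - fi for yi, fi in zip(y, fx)]
        # stop once successive iterates agree to within tol
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    return x


def toy_diffusion(x):
    """Hypothetical diffusion term on the path graph 0 - 1 - 2.

    Each node receives 0.4 times the mean of its neighbours' values,
    so the map is a contraction in the max norm.
    """
    return [0.4 * x[1], 0.4 * (x[0] + x[2]) / 2.0, 0.4 * x[1]]


# Forward pass: node 0 is the true source.
x_true = [1.0, 0.0, 0.0]
y_obs = [xi + fi for xi, fi in zip(x_true, toy_diffusion(x_true))]

# Inverse pass: recover the source from the observation.
x_rec = invert_residual(y_obs, toy_diffusion)
```

The contraction assumption is what carries the example: without it the iteration may diverge or the inverse may not be unique.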